The complete chloroplast genome sequence of Convallaria majalis L.

Abstract The complete chloroplast genome of an important medicinal plant, Convallaria majalis Linnaeus, was sequenced for the first time. The circular genome is 162,218 bp in length, with a GC content of 37.9%. It consists of a large single-copy region (LSC) of 85,417 bp, a small single-copy region (SSC) of 18,495 bp, and two inverted repeat regions (IRs) of 29,153 bp each. The genome harbors 133 genes, including 87 protein-coding genes, 38 tRNA genes, and eight rRNA genes. A phylogenetic tree of 24 plant species was constructed using the maximum-likelihood method. This study provides a theoretical basis for further genetic and phylogenetic research on plants.


Convallaria majalis Linnaeus, a perennial herb of Asparagaceae, is widely distributed in Asia, Europe and North America, growing in wet places such as shady forests and ditches according to Flora Reipublicae Popularis Sinicae (FRPS). It contains many cardiac glycosides, which have cardiotonic and diuretic effects. Furthermore, a 2014 study reported that odorous components derived from C. majalis constituted more than 20% of the perfume raw material market (Dörrich et al. 2014). The aromatic oil, which is abundant in C. majalis, has a sweet and elegant fragrance and is widely used in the production of soap and cosmetics. In addition to its ornamental and pharmaceutical value, C. majalis is known as a toxic plant. Tissue factor expression induced by saponins and various cardiac glycosides in C. majalis contributes to the development of a hypercoagulable state, which has often led to plant poisoning among children in Finland (Lamminpää and Kinos 1996; Morimoto et al. 2021). Recent studies show that steroidal glycosides derived from C. majalis are cytotoxic to human lung adenocarcinoma cells and may therefore be potential agents against lung cancer (Matsuo et al. 2017).
The chloroplast genome of C. majalis is 162,218 bp in length, with a GC content of 37.9%. It harbors 133 genes, including 87 protein-coding genes, 38 tRNA genes, and eight rRNA genes. The genome consists of a large single-copy region (LSC) of 85,417 bp, a small single-copy region (SSC) of 18,495 bp, and two inverted repeat regions (IRs) of 29,153 bp each. Additionally, 15 genes (trnK-UUU, rps16, trnG-UCC, atpF, rpoC1, trnL-UAA, trnV-UAC, petB, petD, rpl16, rpl2, ndhB, trnI-GAU, trnA-UGC and ndhA) each contain one intron, clpP and ycf3 each contain two introns, and rps12 is trans-spliced.
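The reported figures can be cross-checked with simple arithmetic: in a quadripartite chloroplast genome the total length should equal LSC + SSC + 2 × IR, and the gene categories should sum to the stated gene total. The following is a minimal sketch of that check; the variable and function names are illustrative and not part of the original analysis.

```python
# Region lengths (bp) and gene counts as reported in the text.
LSC = 85_417
SSC = 18_495
IR = 29_153            # length of each inverted repeat copy
REPORTED_TOTAL = 162_218

GENE_COUNTS = {"protein_coding": 87, "tRNA": 38, "rRNA": 8}
REPORTED_GENES = 133

# Genes reported to contain exactly one intron.
ONE_INTRON = [
    "trnK-UUU", "rps16", "trnG-UCC", "atpF", "rpoC1",
    "trnL-UAA", "trnV-UAC", "petB", "petD", "rpl16",
    "rpl2", "ndhB", "trnI-GAU", "trnA-UGC", "ndhA",
]


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome made of LSC, SSC and two IR copies."""
    return lsc + ssc + 2 * ir


if __name__ == "__main__":
    total = quadripartite_length(LSC, SSC, IR)
    print("length consistent:", total == REPORTED_TOTAL)
    print("gene count consistent:", sum(GENE_COUNTS.values()) == REPORTED_GENES)
    print("single-intron genes listed:", len(ONE_INTRON))
```

With the values above, the four regions sum to the reported genome length, the three gene categories sum to 133, and the single-intron list contains 15 entries.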
Twenty-three plant species and the outgroup Crocus sativus were selected for phylogenetic analysis based on complete CP genomes. A maximum likelihood (ML) tree was constructed with IQ-TREE 1.6.12 (Nguyen et al. 2015) under the TVM+F+R3 model, chosen according to the Bayesian information criterion (BIC), with 1000 bootstrap replicates (Figure 1). The phylogenetic tree shows that C. majalis and Convallaria keiskei are sister groups. Crocus sativus, as the outgroup, is distant from the other species. The species of each genus are clearly separated, except for Ophiopogon bodinieri, which is closer to the genus Liriope, since Liriope spicata and Liriope muscari were once separated from the genus Ophiopogon.
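For readers who wish to run a comparable analysis, the sketch below assembles an IQ-TREE 1.x command line with the model and replicate number given in the text. Several points are assumptions: the alignment file name is hypothetical, the alignment step itself is not described here, and the text does not say whether standard (`-b`) or ultrafast (`-bb`) bootstrap was used, so the choice is left as a parameter. The script only prints the command and does not execute it.

```python
def iqtree_command(alignment, model="TVM+F+R3", replicates=1000, ultrafast=True):
    """Assemble an IQ-TREE 1.x argument list (illustrative; not the authors' exact call).

    alignment  -- path to a multiple sequence alignment (hypothetical file name)
    model      -- substitution model; "MFP" would let ModelFinder pick one by BIC
    replicates -- number of bootstrap replicates
    ultrafast  -- use ultrafast bootstrap (-bb) instead of standard bootstrap (-b)
    """
    boot_flag = "-bb" if ultrafast else "-b"
    return ["iqtree", "-s", alignment, "-m", model, boot_flag, str(replicates)]


if __name__ == "__main__":
    # "cp_genomes_aligned.fasta" is a placeholder for the 24-genome alignment.
    cmd = iqtree_command("cp_genomes_aligned.fasta")
    print(" ".join(cmd))
```

Passing `model="MFP"` instead of a fixed model would ask IQ-TREE to select the best-fit model itself, which by default is ranked by BIC.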
In conclusion, the complete CP genome of C. majalis was determined in this study, providing a theoretical foundation for further study of phylogenetic relationships within the Asparagaceae family.

Author contributions
W.X.M., Y.Y.S., C.B., and Y.P.X. carried out the sampling and analyses; T.G.K. and M.X. were involved in validation and supervision. All authors contributed to the design of the work, the analysis of the data, and the revision of the draft. All authors approved the final version.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Data availability statement
The genome sequence data that support the findings of this study are openly available in GenBank of NCBI (https://www.ncbi.nlm.nih.gov/) under accession no. OK448481. The associated BioProject, SRA, and BioSample numbers are PRJNA769782, SRX12547454 and SAMN22169498, respectively.